Optimization of Common Table Expressions in MPP Database Systems
Abstract
Big Data analytics often include complex queries with similar or identical expressions, usually referred to as Common Table Expressions (CTEs). CTEs may be explicitly defined by users to simplify query formulation, or implicitly included in queries generated by business intelligence tools, financial applications and decision support systems. In Massively Parallel Processing (MPP) database systems, CTEs pose new challenges due to the distributed nature of query processing, the overwhelming volume of underlying data and the scalability criteria that systems are required to meet. In these settings, the effective optimization and efficient execution of CTEs are crucial for the timely processing of analytical queries over Big Data. In this paper, we present a comprehensive framework for the representation, optimization and execution of CTEs in the context of Orca, Pivotal’s query optimizer for Big Data. We demonstrate experimentally the benefits of our techniques using an industry-standard decision support benchmark.
Similar resources
An Authorization Framework for Database Systems
Today, data plays an essential role in all levels of human life, from personal cell phones to medical, educational, military and government agencies. In such circumstances, the rate of cyber-attacks is also increasing. According to official reports, data breaches exposed 4.1 billion records in the first half of 2019. An information system consists of several components, which one of the most im...
fAST Refresh using Mass Query Optimization
Automatic Summary Tables (ASTs), more commonly known as materialized views, are widely used to enhance query performance, particularly for aggregate queries. Such queries access a huge number of rows to retrieve aggregated summary data while performing multiple joins in the context of a typical data warehouse star schema. To keep ASTs consistent with their underlying base data, the ASTs are eit...
Exploring Query Optimization Techniques in Relational Databases
In the modern era, digital data is considered as the more valuable asset of an organization, and the organizations assign more significance to it than the software and hardware assets. Database systems are computer-based record keeping systems, which have been developed to store data for efficient retrieval and processing. One particular approach is the relational databases in which all the inf...
Managing Expressions as Data in Relational Database Systems
A wide-range of applications, including Publish/Subscribe, Workflow, and Web-site Personalization, require maintaining user’s interest in expected data as conditional expressions. This paper proposes to manage such expressions as data in Relational Database Systems (RDBMS). This is accomplished 1) by allowing expressions to be stored in a column of a database table and 2) by introducing a SQL E...
Investigating the Role of Factors Affecting Failure Frequency in Water Mains Using a Hybrid Regression Model
A water distribution network is one of the important parts of infrastructure systems. The efficient management and proactive planning of capital investment of these assets are fundamental for efficient and effective service delivered by water companies. The direct economic costs (i.e. rehabilitation investment, repair costs, water loss, etc.) as well as indirect costs (i.e. service and traffic ...
Journal: PVLDB
Volume: 8
Publication date: 2015